Developing an Electroencephalography-Based Model for Predicting Response to Antidepressant Medication

This prognostic study explores the use of an electroencephalography (EEG)–based model to predict response to specific selective serotonin reuptake inhibitor treatments for major depressive disorder.

(4) high suicidal risk, defined by clinician judgment; (5) substance dependence/abuse in the past 6 months; (6) presence of significant neurological disorders, head trauma, or other unstable medical conditions; (7) pregnant or breastfeeding; (8) failure of 3 or more adequate pharmacologic interventions (as determined by the Antidepressant Treatment History Form); (9) started psychological treatment within the past 3 months with the intent of continuing treatment; (10) patients who have previously failed escitalopram or showed intolerance to escitalopram and patients at risk for hypomanic switch (i.e., with a history of antidepressant hypomania).
(7) meeting DSM-IV criteria for substance abuse in the last 2 months or substance dependence in the last 6 months (except for nicotine); (8) require immediate hospitalization for psychiatric disorder; (9) have an unstable general medical condition (GMC) that will likely require hospitalization or be deemed terminal (life expectancy < 6 months after study entry); (10) require medications for their GMCs that contraindicate any study medication; (11) have epilepsy or other conditions requiring an anticonvulsant; (12) receiving or have received during the index episode vagus nerve stimulation, ECT, or rTMS, or other somatic antidepressant treatments; (13) currently taking any of the following exclusionary medications: antipsychotic medications, anticonvulsant medications, mood stabilizers, central nervous system stimulants, daily use of benzodiazepines or hypnotics, or antidepressant medication used for the treatment of depression or other purposes such as smoking cessation, since these agents may interfere with the testing of the major hypotheses under study. Non-excluded concomitant medications are acceptable as long as their clinician determines that antidepressant treatment is safe and appropriate; (14) significant liver disease that would contraindicate any study medication; (15) participants taking thyroid medication for hypothyroidism may be included only if they have been stable on the thyroid medication for 3 months; (16) using agents that are potential augmenting agents (e.g., T3 in the absence of thyroid disease, SAMe, St. John's Wort, lithium, buspirone, Omega 3 fatty acids); (17) therapy that is depression specific, such as CBT or Interpersonal Psychotherapy of Depression (IPT), is not allowed during participation (participants can participate if they are receiving psychotherapy that is not targeting the symptoms of depression, such as supportive therapy or marital therapy); (18) currently actively suicidal or considered a high suicide risk; (19) are currently enrolled in another study, and participation in that study contraindicates participation in the EMBARC study; (20) any reason not listed herein yet, determined by the site PI, medical personnel, or designee that constitutes good clinical practice and that would in the opinion of the site PI, medical personnel, or designee make participation in the study hazardous.
Treatment: Participants were randomized to an 8-week course of sertraline or placebo using a double-blind design. Sertraline was started at 50 mg daily and could be increased to 200 mg daily for non-responders. The same dosing schedule was applied to placebo treatment. Using the Clinical Global Improvement scale (CGI), participants were identified as responders (≥ 50% improvement on the CGI from baseline to week 8) to their respective treatment at week 8.
Responding participants continued to receive either sertraline or placebo for a further 8 weeks; non-responding participants to sertraline at week 8 received an augmented treatment with bupropion (150-450 mg daily), and non-responding participants to placebo were switched to sertraline. The primary outcome measure was the Hamilton Depression Rating Scale (HDRS-17), which was assessed every week during the trial.
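The responder rule stated above (at least 50% improvement from baseline to week 8) is simple arithmetic on two scores. The following is a minimal sketch of it, not the study's own code; the function names and the `threshold` parameter are illustrative assumptions.

```python
def percent_improvement(baseline: float, week8: float) -> float:
    """Percent reduction in a symptom score from baseline to week 8."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - week8) / baseline


def is_responder(baseline: float, week8: float, threshold: float = 50.0) -> bool:
    """True when the improvement meets the >= 50% criterion (hypothetical helper)."""
    return percent_improvement(baseline, week8) >= threshold
```

For example, a score that falls from 24 to 12 is exactly a 50% improvement and counts as a response under this rule, whereas a fall from 24 to 13 does not.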

eAppendix 3. EEG Data Standardization
CAN-BIND: EEG data were standardized according to the following processing steps: selection of 32 common electrodes, resampling to 500 Hz, and re-referencing to the average reference. In addition, only the first 5 minutes of the eyes-closed recording were used. The 32 electrodes, 500 Hz sampling rate, and 5-minute duration were chosen to match the new EEG device and recording procedures that will be used in future studies at CAN-BIND.
EMBARC: A similar standardization was performed, except that the sampling rate was set to 250 Hz (see eTable 1). The duration of the recording also differed slightly, as only a total of 4 minutes of eyes-closed data was recorded in the EMBARC study.
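As an illustration of the standardization steps above, the sketch below selects a set of common electrodes, keeps the first part of the recording, and re-references to the average. It is a simplified, standard-library-only sketch under stated assumptions: the data layout (a dict of channel name to sample list), the function name, and its parameters are hypothetical, and resampling is deliberately left out because it requires a proper anti-aliasing filter.

```python
from statistics import fmean


def standardize(recording, common_channels, sfreq, keep_seconds):
    """Select channels, crop, and average-reference a recording.

    recording: dict mapping channel name -> list of samples (assumed layout).
    """
    n_keep = int(round(sfreq * keep_seconds))
    # Steps 1 and 2: keep only the common electrodes and the first
    # `keep_seconds` of data.
    selected = {ch: recording[ch][:n_keep] for ch in common_channels}
    n_samples = min(len(x) for x in selected.values())
    # Step 3: average reference, i.e. subtract the across-channel mean
    # from every channel at each time point.
    out = {ch: [] for ch in common_channels}
    for t in range(n_samples):
        ref = fmean(selected[ch][t] for ch in common_channels)
        for ch in common_channels:
            out[ch].append(selected[ch][t] - ref)
    return out
```

After average re-referencing, the channel values at any time point sum to zero, which is a convenient sanity check on the output.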

eAppendix 4. EEG Data Processing
A customized, fully automated pipeline was used to remove artifacts from the data. This pipeline was built using EEGLAB plugin functions and an adapted ERPEEG toolbox 4. If needed, data were first resampled to the required sampling rate. A second-order Butterworth IIR filter was then applied to high-pass filter the data at 1 Hz. In a third step, bad channels were deleted using the EEGLAB plugin clean_rawdata, and bad segments were corrected using the Artifact Subspace Reconstruction algorithm 5. The ZapLine method was then employed to eliminate power line artifacts 6. In step five, independent component analysis was conducted to decompose the data into independent components. Components associated with eye movements, eye blinks, muscle, and cardiac artifacts were then removed using the ICLabel algorithm 7. Finally, previously deleted channels were reconstructed with spherical interpolation, and data were re-referenced to the average.
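Only the high-pass step of this pipeline lends itself to a short self-contained example. The sketch below designs a second-order Butterworth high-pass section with the bilinear transform and applies it in a single causal pass. It is an illustration under assumptions, not the pipeline's implementation: the function names are hypothetical, and the source does not say whether the original filter was applied in one direction or as a zero-phase (forward and backward) filter.

```python
import math


def butter2_highpass_coeffs(cutoff_hz, sfreq):
    """Second-order Butterworth high-pass biquad via the bilinear transform."""
    w0 = 2.0 * math.pi * cutoff_hz / sfreq
    q = 1.0 / math.sqrt(2.0)  # Q of a second-order Butterworth section
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalized so that a[0] == 1.
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(b, a, x):
    """Causal, single-pass filtering with the difference equation of a biquad."""
    y = []
    for n, xn in enumerate(x):
        acc = b[0] * xn
        if n >= 1:
            acc += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            acc += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(acc)
    return y
```

With a 1 Hz cutoff at a 250 Hz sampling rate, a constant offset decays to essentially zero within a few seconds, while a 20 Hz oscillation passes almost unchanged.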

eAppendix 5. EEG Features Reduction
To reduce the number of EEG features, the 32 electrodes were grouped into 14 brain regions (bilateral frontal, temporal, central, parietal, and occipital regions, and midline frontal, central, parietal and occipital regions, eFigure 1) and features were averaged within each region.
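The feature reduction in this appendix (averaging within regions, then forming left/right ratios) can be sketched as follows. This is a minimal illustration, not the study's code: the dictionary keys and function names are hypothetical, only the two frontal regions from the eFigure 1 legend are shown, and `n_features` simply reproduces the counting rule used in this supplement (bands x 14 regions + bands x 5 region pairs).

```python
from statistics import fmean

# Two of the 14 regions, with electrodes as listed in the eFigure 1 legend.
REGIONS = {
    "frontal_left": ["FP1", "AF3", "F3"],
    "frontal_right": ["FP2", "AF4", "F4"],
}
PAIRS = [("frontal_left", "frontal_right")]


def region_average(values, regions):
    """Average a per-electrode feature within each region."""
    return {name: fmean(values[ch] for ch in chans) for name, chans in regions.items()}


def asymmetry(region_values, pairs):
    """Left-hemisphere value divided by right-hemisphere value for each pair."""
    return {f"{left}/{right}": region_values[left] / region_values[right]
            for left, right in pairs}


def n_features(n_bands, n_regions=14, n_pairs=5):
    """Regional features plus asymmetry features for a given number of bands."""
    return n_bands * n_regions + n_bands * n_pairs
```

With 5 frequency bands this gives 95 spectral features and with 3 timescale bands 57 multiscale entropy features, which together make the 152 features reported.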
Asymmetry features were also computed by dividing features in the left hemisphere by features in the right hemisphere for 5 pairs of brain regions. In total, we had 152 features: 95 spectral
hyperparameters are tuned in the inner loop. This ensures that the test part of each fold in the outer loop is not used for optimization of the hyperparameters. In this approach, no final model is built, as a different set of optimal hyperparameters might be obtained in each fold. However, this procedure is necessary to get an unbiased estimate of the model performance when no independent dataset is available 9.
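The nested leave-one-out scheme described above can be sketched as below. The classifier is a toy one-dimensional k-nearest-neighbour vote chosen only to keep the example self-contained; it is an assumption and stands in for whatever model and hyperparameter grid the study actually used. All names (`knn_predict`, `loo_accuracy`, `nested_loo`, `k_grid`) are hypothetical.

```python
def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest 1-D training points (toy classifier)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def loo_accuracy(xs, ys, k):
    """Inner loop: leave-one-out accuracy of the classifier on (xs, ys)."""
    hits = 0
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        hits += knn_predict(tx, ty, xs[i], k) == ys[i]
    return hits / len(xs)


def nested_loo(xs, ys, k_grid=(1, 3, 5)):
    """Outer loop with n folds; inner loop with n-1 folds on each training part."""
    predictions, chosen = [], []
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        # Tune the hyperparameter using only the outer-training samples.
        best_k = max(k_grid, key=lambda k: loo_accuracy(tx, ty, k))
        chosen.append(best_k)
        # Refit with the selected value and predict the single held-out sample.
        predictions.append(knn_predict(tx, ty, xs[i], best_k))
    return predictions, chosen
```

Because the held-out sample never enters the inner loop, the outer predictions are not influenced by hyperparameter selection. The list `chosen` can differ from fold to fold, which is why no single final model results from this procedure.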

eAppendix 7. Nested K-Fold Cross-Validation
To reduce concerns related to overfitting, we conducted a comparison between nested leave-one-out cross-validation and nested k-fold cross-validation 10. In the outer loop, we experimented with different values of k, such as 5, 10, 20, 30, and 40. The inner loop used the same number of folds k as the outer loop in each case. To address the potential variability introduced by the random partitioning in k-fold cross-validation, we repeated the entire process 10 times. This approach aimed to reduce the impact of chance variations and provide more robust estimations of model performance.
To test whether the difference in symptom severity between participants in CAN-BIND and EMBARC could have an impact on the generalizability of these results, we conducted an exploratory analysis by excluding participants in EMBARC who had low baseline HDRS scores.
We decided to exclude participants in EMBARC who had an HDRS score equal to or below 16

eAppendix 1. Participant Samples
CAN-BIND: In the CAN-BIND 1 study 1,2, participants were recruited at six sites in Canada: University of Calgary (UCA), University of British Columbia (UBC), McMaster University (MAC), Queen's University (QNS), Toronto General Hospital (TGH), and Centre for Addiction and Mental Health (CAMH). All participants provided informed written consent. The study protocol was approved by Research Ethics Boards of participating institutions. It was in accordance with the Declaration of Helsinki and was registered with clinicaltrials.gov (https://clinicaltrials.gov/ct2/show/NCT01655706).
Eligibility: Participants were required to be between 18 and 60 years old and meet the following inclusion criteria: (1) meet Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) criteria for a Major Depressive Episode in Major Depressive Disorder by the Mini International Neuropsychiatric Interview (MINI); (2) episode duration of at least 3 months; (3) free of psychotropic medications for at least 5 half-lives before baseline visit; (4) Montgomery-Åsberg Depression Rating Scale (MADRS) score ≥ 24; (5) fluency in English.
Study exclusion criteria included: (1) any Axis I diagnosis other than MDD that is considered the primary diagnosis; (2) Bipolar I or Bipolar II diagnosis; (3) presence of a significant Axis II diagnosis (borderline, antisocial);

Treatment: During this 16-week trial, participants received escitalopram 10-20 mg during the first 8 weeks in an open-label design. The dose could be reduced back to 10 mg if not tolerated. At week 8, participants were identified as responders to escitalopram based on a reduction in MADRS score of ≥ 50% from baseline to week 8. This group continued to receive escitalopram for the remaining 8 weeks, while non-responders at week 8 received aripiprazole (2-10 mg daily) as an add-on treatment to escitalopram. The primary outcome measure was change in MADRS from baseline to 8 and 16 weeks.
EMBARC: In the EMBARC study 3, four sites in the USA recruited participants: Massachusetts General (MG) Hospital, University of Michigan (UM) Ann Arbor, Columbia University (CU), and University of Texas (TX) Southwestern Medical Center. Written informed consent was obtained from all participants. The study protocol was approved by Research Ethics Boards of participating institutions. It was in accordance with the Declaration of Helsinki and was registered with clinicaltrials.gov (https://clinicaltrials.gov/ct2/show/NCT01407094).
Eligibility: Participants had to be between 18 and 65 years old to be enrolled in the study. Inclusion criteria were: (1) outpatients with a current primary diagnosis of nonpsychotic recurrent or chronic MDD per the Structured Clinical Interview for DSM Disorders (SCID-I); (2) Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR) score of 14 or higher at screening visit and randomization (baseline) visit; (3) no failed antidepressant trials at adequate dose and duration, as defined by the Massachusetts General Hospital Antidepressant Treatment Response Questionnaire (MGH-ATRQ), in the current episode; (4) agrees to, and is eligible for, all biomarker procedures (EEG/psychological testing, magnetic resonance imaging (MRI), and blood draws).
Study exclusion criteria included: (1) history of inadequate response (to trials at adequate dose for adequate duration) or poor tolerability to sertraline (SERT) or bupropion (BUP); (2) pregnant or breastfeeding; (3) plan to become pregnant over the ensuing 12 months following study entry or are sexually active and not using adequate contraception; (4) history (lifetime) of psychotic depression, schizophrenia, bipolar (I, II, or NOS) disorder, schizoaffective disorder, or other; (5) Axis I psychotic disorder; (6) current primary anxiety disorder diagnosis;

eAppendix 2. EEG Data Recording
CAN-BIND: EEG data were collected at 4 sites (CAMH, TGH, QNS, and UBC). Eight minutes of EEG activity were recorded from all participants during the eyes-closed resting condition at baseline.
EMBARC: All 4 sites recorded resting-state EEG during four 2-min blocks (two blocks for eyes-open and two blocks for eyes-closed) at baseline.

© 2023 Schwartzmann B et al. JAMA Network Open.
features (5 frequency bands x 14 brain regions + 5 frequency bands x 5 brain region pairs) and 57 multiscale entropy features (3 timescale bands x 14 brain regions + 3 timescale bands x 5 brain region pairs).
eFigure 1. Division of the Brain Regions and Their Included Electrodes
Frontal left (FP1, AF3 and F3), Frontal midline (FZ), Frontal right (FP2, AF4 and F4), Temporal left (F7, T7 and P7), Temporal right (F8, T8, P8), Central left (FC5, FC1 and C3), Central midline (CZ), Central right (FC2, FC6, C6), Parietal left (P1, P3 and P5), Parietal midline (PZ), Parietal right (P2, P4 and P6), Occipital left (O1 and O3), Occipital midline (OZ) and Occipital right (O2 and O4).
eAppendix 6. Nested Leave-One-Out Cross-Validation
With limited sample size, leave-one-out cross-validation is often used to evaluate the performance of a classifier 8. This strategy allows more training samples to be used in each iteration and reduces the variability of model performance estimations due to the random partition mechanism in k-fold cross-validation 8. With nested leave-one-out cross-validation, the outer loop contains n folds (where n corresponds to the number of samples in the dataset), and the inner loop contains n-1 folds. Using the training part of each fold in the outer loop,

eAppendix 8. Feature Importance
Feature importance was assessed using random permutation in both internal validation and external validation. For internal validation using nested leave-one-out cross-validation with CAN-BIND, the feature values in the testing fold were randomly permuted while keeping the training fold intact. The model was then trained on the training fold and evaluated on the testing fold with permuted values. For each feature, the balanced accuracy was computed and compared to the balanced accuracy obtained with the original data. For external validation with EMBARC, the same process was used, where the feature values in the testing set (in our case, the EMBARC dataset) were randomly permuted while keeping the training set intact (in our case, the CAN-BIND dataset). In both internal and external validations, this procedure was repeated 100 times, and we computed the average reduction in balanced accuracy as a measure of feature importance.
eAppendix 9. Results: Participant Characteristics
CAN-BIND. Of 211 participants recruited in the CAN-BIND 1 study, 192 completed the baseline visit and received at least one dose of escitalopram. Out of these 192 participants, 180 completed all measures up to week 8. In this study, we analyzed data from 125 participants who were recruited at sites where EEG data were collected (eFigure 2). Demographic and clinical characteristics of participants are summarized in eTables 2 and 3.
EMBARC. Of 634 participants enrolled in the EMBARC study 3, 296 entered the randomization phase and 287 received at least one dose of sertraline or placebo. Out of these 287 participants, 240 completed all measures up to week 8, with 114 in the active arm (sertraline-treated) and 126 in the placebo arm. At baseline, EEG data from seventeen participants were missing or unusable, leaving a total of 105 participants in the active arm and 118 participants in the placebo arm for the analysis (eFigure 3). Demographic and clinical characteristics for the participants in the active arm are summarized in eTables 4 and 5.
eFigure 2. Flow of Participants in CAN-BIND
eTable 2. Demographics and Clinical Characteristics in CAN-BIND of Participants Who Completed Week 8
CAMH, Centre for Addiction and Mental Health; QNS, Queen's University; TGH, Toronto General Hospital; UBC, University of British Columbia; MCU, McMaster University; UCA, University of Calgary; MADRS, Montgomery-Åsberg Depression Rating Scale. Note that sites MCU and UCA did not conduct EEG recordings.

eFigure 4. Balanced Accuracy in Internal Validation With CAN-BIND
The figures display the balanced accuracy achieved through k-fold cross-validation with the CAN-BIND sample for various values of k. The x-axis represents the k values, while the y-axis represents the balanced accuracy expressed as a percentage. For each scenario, the balanced accuracy was calculated by averaging 10 iterations of the cross-validation procedure. The error bars depict the standard deviation across the 10 iterations.
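The averaging behind eFigure 4 (10 repetitions of k-fold cross-validation with different random partitions, summarized by mean and standard deviation) can be sketched as follows. This is an illustrative sketch only: `evaluate` is a hypothetical callback that runs one complete cross-validation on the given folds and returns its balanced accuracy, and using the repetition index as the random seed is an assumption.

```python
import random
from statistics import mean, stdev


def kfold_indices(n_samples, k, seed):
    """Shuffle sample indices and split them into k nearly equal test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(evaluate, n_samples, k, n_repeats=10):
    """Mean and SD of the score over `n_repeats` random k-fold partitions."""
    per_repeat = [evaluate(kfold_indices(n_samples, k, seed))
                  for seed in range(n_repeats)]
    return mean(per_repeat), stdev(per_repeat)
```

Calling `repeated_cv(evaluate, n_samples=125, k=10)` would return the value plotted for k = 10 together with the length of its error bar, given a suitable `evaluate`.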
at baseline, which corresponds to a MADRS score below 20 (cutoff value for inclusion criteria in CAN-BIND). Out of the 223 participants in EMBARC, 75 in the sertraline arm and 88 in the placebo arm had a baseline HDRS score of 16 or higher. A model trained using CAN-BIND predicted the outcomes in this subgroup of the sertraline arm with a balanced accuracy of 64.1% (95% CI [53.2, 75.0]; sensitivity, 63.4%; specificity, 64.7%) (eTable 8). Once again, the model
(A) The plot depicts the balanced accuracy obtained in the EMBARC sertraline arm. (B) The plot depicts the balanced accuracy obtained in the EMBARC placebo arm. (A, B) In both plots, the x-axes represent the HDRS score cut-off values chosen to exclude EMBARC participants, and the y-axes represent the balanced accuracy obtained with the corresponding subgroup of participants.
eAppendix 13. Feature Importance
The process of randomly permuting the values of each of the 152 features in the external validation of the placebo arm of EMBARC resulted in a reduction in balanced accuracy ranging from -2.9% to 1.5% (eFigure 6).
eFigure 6. Feature Importance in External Validation With EMBARC Placebo Arm
The figures display the feature importance results obtained in external validation with the EMBARC placebo arm, where each matrix corresponds to the importance of power spectral density (PSD), PSD asymmetry, multiscale entropy (MSE), and MSE asymmetry features. Features shown in pink indicate a low level of importance for the model, whereas features shown in yellow indicate a stronger level of importance. Feature importance was evaluated based on the reduction in balanced accuracy in external validation with the placebo arm of EMBARC. To enhance the clarity of the visualization, features exhibiting an increase in balanced accuracy were adjusted to 0%, indicating a lack of valuable information for predictive purposes (low level of importance). The display range was set from 0% to 4% to facilitate a more meaningful comparison with Figure 1 in the manuscript.
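The permutation procedure used for feature importance in this supplement (shuffle one feature in the test set only, re-evaluate an already trained model, and average the drop in balanced accuracy over 100 repetitions) can be sketched as below. It is a minimal illustration under assumptions: `predict` is a hypothetical function wrapping a model that was trained beforehand on untouched training data, features are stored as lists of rows, and the fixed seed is only there for reproducibility.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = responder)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    pos = sum(t == 1 for t in y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def permutation_importance(predict, X_test, y_test, n_repeats=100, seed=0):
    """Average drop in balanced accuracy when one test-set feature is shuffled."""
    rng = random.Random(seed)
    baseline = balanced_accuracy(y_test, [predict(row) for row in X_test])
    importances = []
    for j in range(len(X_test[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle feature j across test samples; other features stay intact.
            column = [row[j] for row in X_test]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_test, column)]
            score = balanced_accuracy(y_test, [predict(row) for row in permuted])
            drops.append(baseline - score)
        importances.append(mean(drops))
    return importances
```

For display in the style of eFigure 6, negative values (an apparent increase in balanced accuracy after shuffling) can be set to zero with `[max(0.0, v) for v in importances]`.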

eTable 1. EEG Amplifier Settings Used in CAN-BIND and EMBARC
The EEG amplifier settings across sites in the CAN-BIND and EMBARC studies are summarized in eTable 1.
CAMH, Centre for Addiction and Mental Health; QNS, Queen's University; TGH, Toronto General Hospital; UBC, University of British Columbia; MCU, McMaster University; UCA, University of Calgary; CU, Columbia University; MG, Massachusetts General Hospital; UM, University of Michigan Ann Arbor; TX, University of Texas Southwestern Medical Center.

Demographics and Clinical Characteristics in CAN-BIND of Participants Who Completed Week 8 and Had EEG Recording
CAMH, Centre for Addiction and Mental Health; QNS, Queen's University; TGH, Toronto General Hospital; UBC, University of British Columbia; MADRS, Montgomery-Åsberg Depression Rating Scale.